A note on variable susceptibility, the herd-immunity threshold and modeling of infectious diseases

The unfolding of the COVID-19 pandemic has been very difficult to predict using mathematical models for infectious diseases. While it has been demonstrated that variations in susceptibility have a damping effect on key quantities such as the incidence peak, the herd-immunity threshold and the final size of the pandemic, this complex phenomenon is almost impossible to measure or quantify, and it remains unclear how to incorporate it for modeling and prediction. In this work we show that, from a modeling perspective, variability in susceptibility on an individual level is equivalent with a fraction θ of the population having an “artificial” sterilizing immunity. We also derive novel formulas for the herd-immunity threshold and the final size of the pandemic, and show that these values are substantially lower than predicted by the classical formulas, in the presence of variable susceptibility. In the particular case of SARS-CoV-2, there is by now undoubtedly variable susceptibility due to waning immunity from both vaccines and previous infections, and our findings may be used to greatly simplify models. If such variations were also present prior to the first wave, as indicated by a number of studies, these findings can help explain why the magnitude of the initial waves of SARS-CoV-2 was relatively low, compared to what one may have expected based on standard models.

where diag S denotes the diagonal matrix with S(t) on the diagonal and ν denotes (ν 1 , . . . , ν J ) T . Since we assume that susceptibility and infectivity are uncorrelated, the total amount of new infections are distributed over the various I k 's in proportion to the fraction of the population that belong to respective group. More precisely, let x k denote the fraction of the total population that would end up in infectivity group I k , if falling ill, and let x = (x 1 , . . . , x K ) T denote the corresponding vector. The corresponding SIR-equation system becomes where ν = T generation ), the basic SIR then depends only on two parameters, ASI θ and transmission rate α (or, equivalently, R 0 ). Hence, as long as we believe that there exist numbers such that the above system (11)-(13) would be a good model for accurately forecasting disease spread, we can instead run a basic SIR-model with only two unknown parameters, which are (more) easily fitted to real data. In conclusion, we do not need to know the actual transmission probabilities p j,k , or the contact rate a. We now begin to reduce these equation to simpler ones.

The initial conditions are irrelevant
Even if ˜x is different from x, it will not remain so for long, for the equation for I in (12) ensures that at any time point the new cases are distributed according to x. Hence it will only take a few generations before the distribution of infected very closely resembles that of x. This will happen in the very early stages of the disease progression, and therefore we might as well assume thatx = x from the beginning.
More rigorously, we can argue as follows; Given the function ν describing the total incidence, the solution for I j is given by by which it follows thatx j soon becomes irrelevant, due to the exponential decay of the corresponding termx j e −σt .

Variable infectivity is irrelevant; S-SIR
We now reduce the above equations to the Susceptibility-stratified model (S-SIR) used in the main text. Recall that I denotes the vector of all infectivity compartments and I denotes their sum.
Proof. Let (S, I, R) be given and let ν be defined by (11). Given a scalar solution to is a solution to the corresponding row in the equation I = νx − σI. By the uniqueness of such solutions, we conclude that I(t) = xi(t). Hence, since I denotes the sum of I, we see that i = I. With p = M P x it is easy to see that (11) reduces to (14), and that the equation for I in (12) reduces to the middle equation of (15).
Note that the numbers p 1 , . . . , p J in p are between 0 and 1 and can be interpreted as the probability of disease transmission when an "average" infective meets a susceptible in group S 1 , . . . , S J , respectively. We can also compute R 0 given these transmission probabilities via the formula which should be compared with R 0 = αT generation for the basic SIR (see Section 3.1).
The simplest way to derive this is to note that at t = 0 we have S/N = w so the equation for I then reads I (0) = a w, p I − σI. This means that an average infective infects a w, p new individuals every day, which in average goes on for T generation amount of days. Multiplying these two quantities gives (17). The same formula can also be deduced more rigorously using the formal mathematical definition of R 0 as the spectral radius of the next-generation matrix, see e.g. Sec. 5 in [7].
In the S-SIR model (14)-(16), herd-immunity arise when the derivative I (t) becomes negative, i.e. at the point t 0 when the equation a S(t0) N , p − σ = 0. Since it is not possible to theoretically compute the distribution of susceptibles at this point, we can not give a closed formula for the herd-immunity threshold. This underlines the value of the result in the coming section, implying that the value is approximately (2) and (8)). The fact that the value is lower than the classical formula (1) is clearly related to the fact that highly susceptible individuals get infected to a greater extent than the other compartments. However, during a vaccination campaign there will be an equal fraction of immunity in in each respective group, i.e. a constant times w. Letting H IT,vac denote the vaccine induced herd immunity threshold, we thus get the equation

Examples with variable susceptibility
To illustrate, we note that if J = 3, then (14) becomes and now the total amount of newly infected becomes To see why this can not be further reduced to a basic SIR, in a similar way as the proof of Proposition 1.1, note that ν 1 gets withdrawn from S 1 , ν 2 gets withdrawn from S 2 and ν 3 gets withdrawn from S 3 . If individuals in S 1 are much more susceptible than individuals in S 2 , who are much more susceptible than those in S 3 (so p 1 p 2 p 3 ) then the fraction ν 1 (t)/S 1 (t) is much larger than ν 2 (t)/S 2 (t) and ν 3 (t)/S 3 (t), which means that the group of "super-susceptibles" S 1 will become depleted faster. But if p 1 p 2 p 3 then the super-susceptibles contribute to the main part of the coefficient in front of I(t) in (19), and hence it turns out that the disease starts to recede when S 1 becomes sufficiently small, way ahead of what is expected from the standard (homogenous) SIR-model. In other words, while a standard SIR model needs to get above the classical herd-immunity threshold 1 − 1/R 0 before it starts to die out, the model here can die out as soon as the super-susceptibles are depleted. Thus, once the super-susceptibles are depleted, the disease can die out by itself at fractions much lower than predicted by the classical HIT.
This phenomenon is clearly visible in Figure 2 in the main text (dashed curves). For that experiment, we assumed that S 1 consists of the super-susceptibles, for which p 1 = 1, and further that these constitute 30%, i.e. w 1 = 0.3. Similarly we assumed that the average group S 2 constitutes 60% and that only one in 10 contacts leads to disease transmission, i.e. p 2 = 0.1. Finally we supposed that the remaining 10% are well-protected with p 3 = 0.02. Clearly, the group of super-suseptibles have a huge attack rate of around 80% (compared with 72% for the whole population in the standard SIR using the same R 0 = 1.66), but as a consequence, the other two groups have attack rates of around 15% and 5%, respectively. This illustrates why traditional estimates for Herd-Immunity Threshold may be overly pessimistic.
To underline that the behavior of S-SIR seen in Figure 1 and 2 is representative, we provide a second example with only 10% super-spreaders and 30% well protected (i.e. we flip the amount in group S 1 with the amount in group S 3 ). The result is displayed in Supp. Fig. 1. As is plain to see, not much changes; the tiny group of super-susceptibles still "protects" the rest, in the sense that once they are depleted the disease dies out naturally. This group now has a 92% attack rate, as opposed to previously 80% in Fig.   2. Looking at the graph of recovered, we also see that it looks almost as either of the curves in Fig. 1a, only the final size of the pandemic changes to 23%, from 33% for the previous S-SIR and 72% for standard SIR, respectively. Intuitively, it should be possible to find a setting of parameters where the disease burns out among the super-susceptibles, but lives on a bit longer in the "average" group. However, the main finding of this paper is that this intuition is wrong, as we establish in the next section.
Supp. Fig 1. Comparison graphs of R and S for alternative compartment sizes. (a) Graphs of recovered as fraction of the population for SIR and S-SIR with a fixed value of R 0 = 1.66, along with estimates for the Herd-Immunity Threshold. As expected, S-SIR reaches a final size of the pandemic way below the classical HIT, but about twice that of the HIT as computed according to formula (2) suggested in this paper. (b) Corresponding graphs for susceptibles, which should be compared with Figure 2 in the main text. The model assumes 10% super-spreaders and 30% well protected, as opposed to the situation with 30% super-spreaders and 10% well protected depicted in Figure 2.
We now solve the equation for r 0 (t) in (20), which is separable and can be written dr 0 /dt σĩ 0 (r 0 (t)) = 1.
Integrating both sides with respect to t and making the change of variables x = r 0 (t) gives Letting t 0 be the primitive function of 1/(σĩ 0 ) satisfying t 0 (0) = 0 we thus obtain the implicit solution t 0 (r 0 (t)) = t. Hence, t 0 is also the inverse of r 0 (which exists since r 0 (t) > 0 for all t > 0), and so By differentiating (23) we see thatĩ 0 is concave. Let r ∞ be the positive solution of the equationĩ 0 (r) = 0, which thus is unique. By standard calculus this gives rise to a non-integrable singularity in (24), and hence lim r→r∞ t 0 (r) = ∞. Since r 0 is the inverse of t 0 , we see that r ∞ equals the limit lim t→∞ r 0 (t), known as the final size of the pandemic. We therefore refer to this number simply as r 0 (∞) in what follows.
Applying the same arguments to (14)-(16) we obtain first a system in terms of and then an equivalent reduced system of the form Let (s 1 , . . . ,s J ,ĩ 1 ) be a solution to (27) withĩ 1 (0) = ε ands j (0) = S j (0)/N = w j for j = 1, . . . , J. We will write r 1 (t) to denote the function described by (24) withĩ 0 swapped forĩ 1 . As in (22)-(23) we obtain the solutions See Supp. Fig. 2 (black curve) for an illustration (using the same values as in Figure 1-2 in the main text). In particular note that it meets the x-axis slightly above 0.3, which coincides with the final size of the pandemic seen for the black graph in Figure 1 (main text).
The philosophy behind formula (8), is that we want to pick the parameters α and ω so that (23) becomes approximately equal to (29). Then, due to (24) we will have r 1 (t) approximately equal to r 0 (t) and therefore i 1 (r 1 (t)) approximately equal to i 0 (r 0 (t)).

Now, by Taylor's formula we havẽ
(with strict inequality since at least one p j is strictly less than one). These formulas can be put in somewhat neater form by introducing R 1 = a/σ, i.e. the R 0 -value one would have at the beginning of the pandemic if everyone were a super-susceptible. We then get which should be compared with (23). Note that F 1 is engineered to have same first three Taylor coefficients as F 0 . Moreover, introducing we obtain r 1 as t −1 1 and subsequently r 1 = σi 1 .
To summarize this section, we argue that if F 0 ≈ F 1 and r 0 ≈ r 1 , then it should follow that i 0 ≈ i 1 since i j (t) =ĩ j (r j (t)), j = 0, 1, andĩ 0 andĩ 1 are given by (23) and (31), respectively. To formalize this statement a bit, consider now the more general ODE where Then r 0 (t) is the unique solution to (33) for δ = 0, while r 1 (t) is the unique solution to (33) for δ = 1. Since G(x, δ) is independent of t and continuous in δ for every fixed x, it follows that the solution r(t) = r(t, δ) of (33) is continuous in δ, uniformly in t ∈ [0, T ] for any T > 0, see for example [8] and the references therein. We collect these observations in the following theorem.  (8)) equal to 64%, 45% and 25%. Note that basic SIR with ASI becomes a special case of S-SIR if we choose a function p that is either 0 or 1 with a sharp drop at 100ω. These curves are also seen in the embedded pictures.
Supp. Fig 3. p-curves. Top: Curves displaying the probability of transmission when a susceptible person from group j meets an infective. The blue curves correspond to different ρ-values and the black curves display the corresponding binary immunity curves with ASI obtained from (8), yielding 64%, 45% and 25%, respectively. Bottom: corresponding waves of prevalence for S-SIR and SIR with ASI, (using parameters obtained from (8); an even more perfect overlap is possible with free parameters α and ω for SIR with ASI, similar to Figure 1 in the main text.) Supp. Fig 4. Envelope foliation. Line between pairs (r 0 (t), i 0 (t)) and (r 1 (t), i 1 (t)) for a grid of t-values, using the same parameters as in Fig. 2 for S-SIR and parameters from (8) for SIR. Note that the inner and outer envelope is given byĩ 0 andĩ 1 (cf. Fig.  2), and that the slopes are decreasing but positive.

Further results and a conjecture
It remains to establish proper estimates proving that the resulting solution (i 0 , r 0 ) is near (i 1 , r 1 ). We first establish the following important inequality, which is the explanation behind that the black curve (from S-SIR) always seems to be a bit higher than the blue one (from SIR).
Moreover we have Multiplying with e αr/σ we obtain J j=1 w j ap j e (α−apj )r/σ − ωα which is a strictly convex function (unless α = ap j for all j, which means that all probabilities are the same and the model collapses to the standard SIR model, and there is nothing to prove) that in addition equals zero and has derivative zero at r = 0 by construction. By convexity it follows that the function is non-negative, giving dĩ 1 /dr ≥ dĩ 0 /dr with strict inequality for r > 0. Convexity also implies that the function and its derivative are increasing. The desired inequality follows by integrating dĩ 1 /dr ≥ dĩ 0 /dr, keeping in mind thatĩ 1 (0) =ĩ 0 (0). Proof. Let g(x) = σĩ 0 (x). Then r 0 (t) = g(r 0 (t)) while r 1 (t) ≥ g(r 1 (t)) by Proposition 2.2. Since r 1 (0) = r 0 (0), standard theory of ordinary differential equations then dictates that r 1 (t) ≥ r 0 (t) for t ≥ 0.
However, we believe there is more to say than what is captured in the above two propositions. Supp. Fig. 4 gives an illustration of how the pairs (i 0 (t), r 0 (t)) and (i 1 (t), r 1 (t)) evolve with time. Note how the distance seems to be gradually increasing and always points in the north-east direction. We have not been able to prove this nor the closely related following observation, which in a more concrete way would imply that i 1 and i 0 are close. Conjecture 2.4. We have i 1 ≥ i 0 and furthermore, s 0 − J j=1 s j is an increasing function.